Coded Shotgun Sequencing

Abstract

Most DNA sequencing technologies are based on the shotgun paradigm: many short reads are obtained from random unknown locations in the DNA sequence. A fundamental question, studied in Motahari et al. (2013), is what read length and coverage depth (i.e., the total number of reads) are needed to guarantee reliable sequence reconstruction. Motivated by DNA-based storage, we study the coded version of this problem; i.e., the scenario where the DNA molecule being sequenced is a codeword from a predefined codebook. Our main result is an exact characterization of the capacity of the resulting shotgun sequencing channel as a function of the read length and coverage depth. In particular, our results imply that, while in the uncoded case $O(n)$ reads of length greater than $2 \log n$ are needed for reliable reconstruction of a length-$n$ binary sequence, in the coded case only $O(n/\log n)$ reads of length greater than $\log n$ are needed for the capacity to be arbitrarily close to 1.
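
The shotgun read model described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's construction: it assumes noise-free reads of one fixed length, start positions drawn uniformly at random, and a circular sequence so that reads wrap around the end. The function names `sample_reads`, `covered_fraction`, and `expected_covered_fraction` are hypothetical and chosen only for this example. It shows how read length and number of reads together determine how much of the sequence is observed at all; it does not compute the channel capacity or use any codebook.

```python
import math
import random


def sample_reads(sequence, num_reads, read_len, rng):
    """Return (start, read) pairs drawn from uniformly random start positions.

    Assumption: the sequence is treated as circular, so reads wrap around.
    """
    n = len(sequence)
    reads = []
    for _ in range(num_reads):
        start = rng.randrange(n)
        read = [sequence[(start + i) % n] for i in range(read_len)]
        reads.append((start, read))
    return reads


def covered_fraction(n, reads):
    """Fraction of the n positions that fall inside at least one read."""
    covered = [False] * n
    for start, read in reads:
        for i in range(len(read)):
            covered[(start + i) % n] = True
    return sum(covered) / n


def expected_covered_fraction(n, num_reads, read_len):
    """Expected coverage under uniform circular sampling (needs read_len <= n).

    A fixed position is missed by a single read with probability
    1 - read_len / n, independently across reads.
    """
    return 1.0 - (1.0 - read_len / n) ** num_reads


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20000
    sequence = [rng.randrange(2) for _ in range(n)]  # random binary sequence
    reads = sample_reads(sequence, num_reads=2000, read_len=30, rng=rng)
    depth = 2000 * 30 / n  # average number of reads covering a position
    print("average depth:", depth)
    print("empirical covered fraction:", covered_fraction(n, reads))
    print("expected covered fraction: ", expected_covered_fraction(n, 2000, 30))
    print("approximation 1 - exp(-depth):", 1.0 - math.exp(-depth))
```

Under these assumptions the uncovered fraction decays roughly like $e^{-KL/n}$ for $K$ reads of length $L$, which gives some intuition for why the number and length of reads are the natural parameters of the problem.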

Similar resources

Human whole-genome shotgun sequencing.

Large-scale sequencing of the human genome is now under way (Boguski et al. 1996; Marshall and Pennisi 1996). Although at the beginning of the Genome Project, many doubted the scientific value of sequencing the entire human genome, these doubts have evaporated almost entirely (Gibbs 1995; Olson 1995). Primary reasons for generating the human genomic sequence are listed in Table 1. The approach ...

Shotgun protein sequencing with meta-contig assembly.

Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at l...

Solving Repeat Problems in Shotgun Sequencing

Shotgun sequencing is the most powerful strategy for large scale sequencing. Two main approaches exist: clone-by-clone and whole genome shotgun (WGS). In the clone-by-clone strategy, overlapping clones are amplified and then sheared in a random fashion. In the WGS approach, a sufficient amount of cells from the target organism are obtained, and the random shearing is performed on extracted DNA....

An Algorithm for Whole Genome Shotgun Sequencing

The push to sequence the entire human genome is gearing up [1]. Recently there has been disagreement within the genome community as to the best approach for sequencing the human genome. While many scientists believe that clone mapping is the best solution, others argue that whole genome shotgun sequencing is a cheaper and faster alternative. Arguments for [2] and against [3] whole genome shotgu...

OTU Analysis Using Metagenomic Shotgun Sequencing Data

Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential of uncovering the true struc...

Journal

Journal title: IEEE Journal on Selected Areas in Information Theory

Year: 2022

ISSN: 2641-8770

DOI: https://doi.org/10.1109/jsait.2022.3151737